Retrieval of Research-level Mathematical Information Needs: A Test Collection and Technical Terminology Experiment
نویسندگان
چکیده
In this paper, we present a test collection for mathematical information retrieval composed of real-life, researchlevel mathematical information needs. Topics and relevance judgements have been procured from the on-line collaboration website MathOverflow by delegating domain-specific decisions to experts on-line. With our test collection, we construct a baseline using Lucene’s vectorspace model implementation and conduct an experiment to investigate how prior extraction of technical terms from mathematical text can affect retrieval efficiency. We show that by boosting the importance of technical terms, statistically significant improvements in retrieval performance can be obtained over the baseline.
منابع مشابه
Mathematical Information Retrieval based on Type Embeddings and Query Expansion
We present an approach to mathematical information retrieval (MIR) that exploits a special kind of technical terminology, referred to as a mathematical type. In this paper, we present and evaluate a type detection mechanism and show its positive effect on the retrieval of research-level mathematics. Our best model, which performs query expansion with a type-aware embedding space, strongly outpe...
متن کاملبررسی تطبیقی اصطلاحنامه معارف اسلامی و علوم قرآنی
This study examines the comparative strengths and weaknesses of the thesaurus and thesaurus Quranic teachings of the Koran. In today's society where the documents are kept electronically, retrieval and dissemination of information for the development of research, much greater importance of saving documents and thesaurus that is the basis for indexing in various sciences, One of the solutions fo...
متن کاملBuilding Corpora of Technical Texts: Approaches and Tools
Building corpora of technical texts in Science, Technology, Engineering, and Mathematics (STEM) domain has its specific needs, especially the handling of mathematical formulae. In particular, there is no widely accepted format to represent and handle math. We present an approach based on multiple representations of mathematical formulae that has been used for math retrieval, similarity and clus...
متن کاملدیداری کردن نتایج جستوجو در فرایند بازیابی اطلاعات
Purpose: One of the most effective ways to achieve optimum information retrieval is through visualization of Information. Search strategies, probing skills, querying of information needs and analysis of information play a significant role in the accessing of necessary and useful information. Besides the factors mentioned above, information visualization can increase the availability level of in...
متن کاملCreating a Dutch Information Retrieval Test Corpus
This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch test data, which is part of the official CLEF multilingual test corpus, and give an overview of the experimental results of companies and research institutions that participated in the first official Dutch CLEF experimen...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015